
Identifiability of phylogenetic parameters from k-mer data under the coalescent

Abstract

Distances between sequences based on their $k$-mer frequency counts can be used to reconstruct phylogenies without first computing a sequence alignment. Past work has shown that effective use of $k$-mer methods depends on 1) model-based corrections to distances based on $k$-mers and 2) breaking long sequences into blocks to obtain repeated trials from the sequence-generating process. Good performance of such methods is based on having many high-quality blocks with many homologous sites, which can be problematic to guarantee a priori. Nature provides natural blocks of sequences into homologous regions, namely, the genes. However, directly using past work in this setting is problematic because of possible discordance between different gene trees and the underlying species tree. Using the multispecies coalescent model as a basis, we derive model-based moment formulas that involve the divergence times and the coalescent parameters. From this setting, we prove identifiability results for the tree and branch length parameters under the Jukes-Cantor model of sequence mutations.
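To make the two ingredients named above concrete, the following is a minimal sketch of an alignment-free distance built from $k$-mer counts and averaged over blocks. It is an illustration only, not the paper's estimator: the squared Euclidean distance between count vectors is an assumed choice, the function names (`kmer_counts`, `squared_euclidean`, `split_into_blocks`, `mean_block_distance`) are hypothetical, and the model-based correction discussed in the abstract is deliberately left out.

```python
from collections import Counter
from itertools import product


def kmer_counts(seq, k, alphabet="ACGT"):
    """Return the k-mer count vector of seq, with k-mers in lexicographic order."""
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    return [counts["".join(kmer)] for kmer in product(alphabet, repeat=k)]


def squared_euclidean(u, v):
    """Squared Euclidean distance between two equal-length count vectors.

    Assumed distance for illustration; no model-based correction is applied.
    """
    return sum((a - b) ** 2 for a, b in zip(u, v))


def split_into_blocks(seq, block_len):
    """Cut seq into consecutive non-overlapping blocks, dropping a short tail."""
    return [seq[i:i + block_len]
            for i in range(0, len(seq) - block_len + 1, block_len)]


def mean_block_distance(seq_a, seq_b, k, block_len):
    """Average the uncorrected k-mer distance over paired blocks.

    Each pair of blocks is treated as one repeated trial of the
    sequence-generating process.
    """
    blocks_a = split_into_blocks(seq_a, block_len)
    blocks_b = split_into_blocks(seq_b, block_len)
    dists = [squared_euclidean(kmer_counts(a, k), kmer_counts(b, k))
             for a, b in zip(blocks_a, blocks_b)]
    if not dists:
        raise ValueError("sequences are shorter than one block")
    return sum(dists) / len(dists)
```

In the gene-based setting the abstract describes, the blocks would be the genes themselves rather than fixed-length windows, so `split_into_blocks` would be replaced by a list of per-gene sequences.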
